1) Sampling your system

How are different fish species distributed in different locations of interest?

Species counts would be the variable to sample: the number of individuals of the chosen species counted at each location. A 5x5 grid of plots (1m x 1m) could be set up at each location and numbered 1-25, and 10 plots would be chosen using a random number generator to avoid bias or confounding. (For example, choosing adjacent plots at every location may seem the most convenient out in the field, but it defeats the purpose of sampling randomly across the larger grid area.) Sampling random plots throughout a location ensures that any variation in a population is represented. Data should be recorded by at least two different observers at a time; this reduces the chance of missed observations if one observer is distracted, and helps ensure accurate counts, particularly of schooling fish. These data would most likely be Poisson distributed, since the observations are recorded as counts (all values will be non-negative integers).
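The plot-selection and count steps can be sketched in R; the seed, grid size, and Poisson mean below are purely illustrative, not values from an actual survey:

```r
# randomly choose 10 of the 25 numbered plots, without replacement
set.seed(42)  # seed chosen arbitrarily for reproducibility
chosen_plots <- sample(1:25, size = 10, replace = FALSE)

# simulate Poisson-distributed counts for the chosen plots,
# assuming (purely for illustration) a mean of 4 fish per plot
counts <- rpois(length(chosen_plots), lambda = 4)
```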


2) Let’s get philosophical.

I believe I will inevitably become a Bayesian once I have practiced and become more familiar with the methods. I didn’t even know there were techniques outside of the strictly frequentist ones I learned in school. Additionally, the more I learn about “frequentist alternatives” such as Bayesian and likelihood approaches, the more I realize how little I understood what I was doing all the years I looked at nothing but p-values.

Bayesian methods also seem more intuitive to me than the others. More importantly, I don’t know how any biologist could fail to appreciate Bayesian statistics for letting us estimate variation in the parameters themselves. When studying biological processes there will never be one “true” value, because life depends so heavily on variation to exist. A Bayesian 95% credible interval provides a range containing 95% of the posterior probability for the parameter, instead of a range produced by a procedure that captures the single “true” value 95% of the time. Bayesian techniques may give you roughly the same numbers as frequentist and likelihood methods, but the interpretation is more informative. My favorite aspect of Bayesian methods is the prior. When starting from scratch, a weak prior can be used so the analysis is driven almost entirely by the data (assuming such a weak prior has essentially no influence on the posterior). Priors also mirror how human beings learn: by making connections between past and new experiences.
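To make the point about weak priors concrete, here is a tiny conjugate-normal sketch (all numbers invented): the posterior mean is a precision-weighted average of prior and data, so a very diffuse prior barely moves the estimate.

```r
prior_mean <- 0;  prior_sd <- 10   # deliberately weak prior
data_mean  <- 5;  data_se  <- 1    # summary of the (hypothetical) data

# weight on the data = data precision / total precision
w <- (1 / data_se^2) / (1 / data_se^2 + 1 / prior_sd^2)
post_mean <- w * data_mean + (1 - w) * prior_mean
post_mean  # ~4.95: almost identical to the data-only estimate
```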


3) Power

Simulate how the following properties alter the power of an F-test from a linear regression, using different alpha levels: sample size, intercept, and slope.

# packages used throughout: dplyr/tidyr/purrr for data manipulation,
# broom for tidying model fits, ggplot2 for plotting
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(ggplot2)

set.seed(607)


###### SAMPLE SIZE, SLOPE ######

# simulation data frame with the parameters and information that varies (baseline parameters from seal data)
simPopN <- data.frame(slope = seq(0, 0.005, 0.001),
                      sigma = 5.6805) %>%
  crossing(n=seq(5, 65, 10))


######### INTERCEPT #########
simPopN <- simPopN %>%
  crossing(intercept = seq(95, 135, 10))

##### (varying alpha levels) ####
simPopN <- simPopN %>%
  crossing(alpha = c(0.001, 0.005, 0.01, 0.05, 0.1, 0.15))

# expand so each parameter combination has n rows (one per simulated observation)
simPopN <- simPopN %>%
  group_by(slope, intercept, sigma, n, alpha) %>%
  expand(reps = 1:n) %>%
  ungroup()

# replicate each parameter combination 500 times
simPopN <- simPopN %>%
  crossing(sim = 1:500)

# simulate data
#(add in fitted values, simulate random draws of ages)
simPopN <- simPopN %>%
  mutate(age.days = runif(n(), 958, 8353)) %>%
  mutate(length.cm = rnorm(n(), intercept + slope*age.days, sigma))


## FIT MODELS AND EXTRACT COEFFICIENTS
fits <- simPopN %>%
  group_by(slope, intercept, sigma, n, sim, alpha) %>%
  nest()

# fit a model, get its coefficients using *broom*
fits <- fits %>%
    mutate(mod = map(data, ~lm(length.cm ~ age.days, data=.))) %>%
    mutate(coefs = map(mod, ~tidy(.)))

#cleanup - unnest, filter for slope coeff
fits <- fits %>%
  unnest(coefs) %>%
  ungroup() %>%
  filter(term == "age.days")

pow <- fits %>%
    group_by(n, slope, intercept, alpha) %>%
    summarise(power = 1-sum(p.value>alpha)/n()) %>%
    ungroup()
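The summarise() call above just computes the fraction of simulations whose p-value falls at or below alpha; a toy illustration with invented p-values:

```r
p_values <- c(0.001, 0.02, 0.04, 0.08, 0.20)  # invented simulation results
alpha <- 0.05
power <- 1 - sum(p_values > alpha) / length(p_values)
power  # 3 of 5 simulated tests reject, so power = 0.6
```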



ggplot(data = pow, aes(x=slope, y=power, color = factor(alpha))) +
  geom_point() +
  geom_line() +
  theme_bw() +
  geom_hline(yintercept = 0.8) +
  facet_grid(intercept ~ n, labeller = label_both) +
  labs(title = "Power Analysis", x="Slope", y = "Power") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))


4) Quailing at the Prospect of Linear Models.

4.1 Three Fits

library(bbmle)     # mle2 for likelihood fits
library(rstanarm)  # stan_glm for Bayesian fits

quails <- read.csv("~/Dropbox/Data_Science/MidTerm/Morphology data.csv")

# tidy up this column name nonsense
names(quails) <- c("Bird", "Sex", "Age.days", "Exp.Temp.C", "Mass.g", "Tarsus", "Culmen", "Depth", "Width", "Notes")

# no NAs, please
quails <- quails %>%
  filter(!is.na(Culmen)) %>%
  filter(!is.na(Tarsus))

# plot of how tarsus predicts culmen
quail_plot <- ggplot(quails, aes(x=Tarsus, y=Culmen)) +
  geom_point()

quail_plot +
  stat_smooth(method = "lm")

##########################################
# LEAST SQUARES
##########################################
quails_mod_lm <- lm(Culmen ~ Tarsus, data=quails)

# assumptions
plot(quails_mod_lm, which=1)

plot(quails_mod_lm, which=2)

##########################################
# LIKELIHOOD
##########################################

## Start values
lm_coefs <- coef(lm(Culmen ~ Tarsus, data=quails))

quails_mod_lik <- mle2(Culmen ~ dnorm(b0 + b1*Tarsus, resid_sd),
                       start = list(b0 = lm_coefs[1], b1 = lm_coefs[2], resid_sd = 4),
                       data = quails)
logLik(quails_mod_lik)
## 'log Lik.' -1249.682 (df=3)
#assumptions
quails_fit <- predict(quails_mod_lik)
quails_res <- residuals(quails_mod_lik)

qplot(quails_fit, quails_res)

qqnorm(quails_res)
qqline(quails_res)

#LRT test of model
quails_mod_lik_null <- mle2(Culmen ~ dnorm(b0 , resid_sd),
                start = list(b0 = 1, resid_sd = 4),
                data = quails)

anova(quails_mod_lik, quails_mod_lik_null)  
## Likelihood Ratio Tests
## Model 1: quails_mod_lik, Culmen~dnorm(b0+b1*Tarsus,resid_sd)
## Model 2: quails_mod_lik_null, Culmen~dnorm(b0,resid_sd)
##   Tot Df Deviance  Chisq Df Pr(>Chisq)    
## 1      3   2499.4                         
## 2      2   3750.6 1251.3  1  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
### The deviance values for the full and null models are far apart from each other, indicating that including the slope term significantly improves the fit.



##########################################
# BAYESIAN
##########################################
#fit the model
quail_mod_bayes <- stan_glm(Culmen ~ Tarsus,
                data = quails, 
                family=gaussian())
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 1.13249 seconds (Warm-up)
##                1.02454 seconds (Sampling)
##                2.15703 seconds (Total)
## 
## (similar sampling output for chains 2-4 omitted)
# INSPECT CHAINS AND POSTERIORS
plot(quail_mod_bayes, plotfun = "stan_trace")

plot(quail_mod_bayes, show_density=TRUE)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)

# INSPECT AUTOCORRELATION
plot(quail_mod_bayes, plotfun = "stan_ac")

# MODEL ASSUMPTIONS
quail_fit <- predict(quail_mod_bayes)
quail_res <- residuals(quail_mod_bayes)

# FIT
qplot(quail_fit, quail_res)

pp_check(quail_mod_bayes, check="scatter")

# NORMALITY
qqnorm(quail_res)
qqline(quail_res)

pp_check(quail_mod_bayes, check="residuals", bins=8)

## MATCH TO POSTERIOR
pp_check(quail_mod_bayes, check="test", test=c("mean", "sd"))

pp_check(quail_mod_bayes, nreps = 10)

# COEFFS
summary(quail_mod_bayes, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = quails)
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 766
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.09720     0.20511    -0.50620    -0.23277    -0.09548
## Tarsus            0.37292     0.00631     0.36020     0.36882     0.37304
## sigma             1.23982     0.03083     1.18064     1.21914     1.23910
## mean_PPD         11.72976     0.06211    11.60769    11.68756    11.72982
## log-posterior -1259.61602     1.17219 -1262.72152 -1260.15899 -1259.30038
##                 75%         97.5%    
## (Intercept)       0.03639     0.31671
## Tarsus            0.37702     0.38529
## sigma             1.25984     1.30100
## mean_PPD         11.77197    11.85131
## log-posterior -1258.76279 -1258.31827
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00367 1.00220 3117 
## Tarsus        0.00011 1.00197 3293 
## sigma         0.00060 1.00046 2662 
## mean_PPD      0.00106 1.00027 3409 
## log-posterior 0.02524 1.00028 2156 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# CIs
posterior_interval(quail_mod_bayes)
##                     5%       95%
## (Intercept) -0.4322149 0.2409565
## Tarsus       0.3623812 0.3833165
## sigma        1.1897560 1.2920638
# VISUALIZE.
quail_chains <- as.data.frame(quail_mod_bayes)

quail_plot2 <- quail_plot +
  geom_point() +
  theme_light(base_family = "Georgia") +
  geom_abline(intercept=quail_chains[,1], slope = quail_chains[,2], alpha=0.1, color="lightblue") +
  geom_abline(intercept=coef(quail_mod_bayes)[1], slope = coef(quail_mod_bayes)[2], color="steelblue4")

quail_plot2 +
  labs(title = "Quail Tarsus(leg) and Culmen(beak) Lengths", x='Tarsus(mm)', y='Culmen(mm)')


4.2 Three interpretations

#LEAST SQUARES
summary(quails_mod_lm)
## 
## Call:
## lm(formula = Culmen ~ Tarsus, data = quails)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.4081 -0.7029 -0.0328  0.7263  3.5970 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.098707   0.215450  -0.458    0.647    
## Tarsus       0.372927   0.006646  56.116   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.238 on 764 degrees of freedom
## Multiple R-squared:  0.8048, Adjusted R-squared:  0.8045 
## F-statistic:  3149 on 1 and 764 DF,  p-value: < 2.2e-16
#LIKELIHOOD
summary(quails_mod_lik)
## Maximum likelihood estimation
## 
## Call:
## mle2(minuslogl = Culmen ~ dnorm(b0 + b1 * Tarsus, resid_sd), 
##     start = list(b0 = lm_coefs[1], b1 = lm_coefs[2], resid_sd = 4), 
##     data = quails)
## 
## Coefficients:
##            Estimate Std. Error z value  Pr(z)    
## b0       -0.0987084  0.2151741 -0.4587 0.6464    
## b1        0.3729272  0.0066372 56.1878 <2e-16 ***
## resid_sd  1.2368003  0.0316001 39.1391 <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## -2 log L: 2499.363
anova(quails_mod_lik, quails_mod_lik_null)
## Likelihood Ratio Tests
## Model 1: quails_mod_lik, Culmen~dnorm(b0+b1*Tarsus,resid_sd)
## Model 2: quails_mod_lik_null, Culmen~dnorm(b0,resid_sd)
##   Tot Df Deviance  Chisq Df Pr(>Chisq)    
## 1      3   2499.4                         
## 2      2   3750.6 1251.3  1  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#BAYESIAN
summary(quail_mod_bayes, digits=5)   
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = quails)
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 766
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.09720     0.20511    -0.50620    -0.23277    -0.09548
## Tarsus            0.37292     0.00631     0.36020     0.36882     0.37304
## sigma             1.23982     0.03083     1.18064     1.21914     1.23910
## mean_PPD         11.72976     0.06211    11.60769    11.68756    11.72982
## log-posterior -1259.61602     1.17219 -1262.72152 -1260.15899 -1259.30038
##                 75%         97.5%    
## (Intercept)       0.03639     0.31671
## Tarsus            0.37702     0.38529
## sigma             1.25984     1.30100
## mean_PPD         11.77197    11.85131
## log-posterior -1258.76279 -1258.31827
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00367 1.00220 3117 
## Tarsus        0.00011 1.00197 3293 
## sigma         0.00060 1.00046 2662 
## mean_PPD      0.00106 1.00027 3409 
## log-posterior 0.02524 1.00028 2156 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
### The coefficients and standard errors are very similar across methods; all values are within about 0.006 of each other. 
### While the numbers are close, the results produced by each of the three techniques are interpreted slightly differently.   



##############################
# 95% CONFIDENCE INTERVALS:
##############################

# LEAST SQUARES
confint(quails_mod_lm)
##                  2.5 %    97.5 %
## (Intercept) -0.5216505 0.3242363
## Tarsus       0.3598809 0.3859727
# If we repeated this sampling many times, 95% of intervals constructed this way would contain the true intercept; here that interval is -0.5216505 to 0.3242363.
# Likewise, we are 95% confident that the interval 0.3598809 to 0.3859727 contains the true slope.

# LIKELIHOOD
confint(quails_mod_lik)
##               2.5 %    97.5 %
## b0       -0.5209627 0.3235484
## b1        0.3599021 0.3859515
## resid_sd  1.1773371 1.3013798
# The 95% confidence interval for intercept is -0.5209627 < b0 < 0.3235484.
# The 95% confidence interval for slope is 0.3599021 < b1 < 0.3859515

# BAYESIAN
posterior_interval(quail_mod_bayes, prob = 0.95)
##                   2.5%     97.5%
## (Intercept) -0.5062050 0.3167150
## Tarsus       0.3601978 0.3852893
## sigma        1.1806402 1.3009992
# We believe there is a 95% probability that the intercept lies between -0.5062050 and 0.3167150.
# We believe there is a 95% probability that the slope lies between 0.3601978 and 0.3852893.

4.3 Every Day I’m Profilin’

Generate the profile 95% and 75% confidence intervals by brute force for the slope and intercept from the likelihood model. Note that to do this you will need to refit some models using the fixed argument of mle2 (e.g., fixed = c(b0 = 3)). Check yourself against the confint results from the mle2 fit. The logLik function will help you extract log-likelihoods from fitted models. Plot the profiles along with reporting the values of the CIs.

set.seed(603)
confint(quails_mod_lik)
##               2.5 %    97.5 %
## b0       -0.5209627 0.3235484
## b1        0.3599021 0.3859515
## resid_sd  1.1773371 1.3013798
# MAX LOG LIKELIHOOD
peak <- logLik(quails_mod_lik)


# classify a fixed-parameter fit as inside or outside the profile CI:
# a value is within the CI when its profile log-likelihood falls no more
# than CI_range_value below the maximum (a constrained fit can never
# exceed the unconstrained maximum, so only the lower bound matters)
CI_fun <- function(mod, peak, CI_range_value) {
  logLik_value <- logLik(mod)
  if (logLik_value >= (peak - CI_range_value)) {
    return("within_CI")
  }
  return("outside_CI")
}
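The cutoff values used below (1.92 and 0.66) are not arbitrary: each is half the chi-square critical value with one degree of freedom, which is where a profile confidence-interval boundary falls.

```r
# half the chi-square critical value with 1 df gives the drop in
# log-likelihood that marks the edge of a profile CI
cutoff_95 <- qchisq(0.95, df = 1) / 2
cutoff_75 <- qchisq(0.75, df = 1) / 2
round(cutoff_95, 2)  # 1.92
round(cutoff_75, 2)  # 0.66
```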


logLik_fun <- function(int_val){
  int_mod_x <- mle2(Culmen ~ dnorm(b0 + b1*Tarsus, resid_sd),
                       start = list(b0 = lm_coefs[1], b1 = lm_coefs[2], resid_sd = 4), fixed = c(b0=int_val),
                       data = quails)
  return(logLik(int_mod_x))
}


logLik_fun2 <- function(slope_val){
  int_mod_x <- mle2(Culmen ~ dnorm(b0 + b1*Tarsus, resid_sd),
                       start = list(b0 = lm_coefs[1], b1 = lm_coefs[2], resid_sd = 4), fixed = c(b1=slope_val),
                       data = quails)
  return(logLik(int_mod_x))
}



##############################
# INTERCEPT VALUES
##############################
intercept_values <- seq(-0.6, 0.4, by = 0.1)

########## 95% CI ##########

CI_fun(logLik_fun(-0.6), peak, 1.92) #outside 95 CI
## [1] "outside_CI"
# within 95% CI
CI_fun(logLik_fun(-0.5), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(-0.4), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(-0.3), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(-0.2), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(-0.1), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(0.0), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(0.1), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(0.2), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(0.3), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun(0.4), peak, 1.92) #outside 95 CI
## [1] "outside_CI"
########## 75% CI ##########

CI_fun(logLik_fun(-0.5), peak, 0.66) # outside 75 CI
## [1] "outside_CI"
CI_fun(logLik_fun(-0.4), peak, 0.66) # outside 75 CI
## [1] "outside_CI"
# within 75% CI
CI_fun(logLik_fun(-0.3), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun(-0.2), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun(-0.1), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun(0.0), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun(0.1), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun(0.2), peak, 0.66) # outside 75 CI
## [1] "outside_CI"
CI_fun(logLik_fun(0.3), peak, 0.66) # outside 75 CI
## [1] "outside_CI"
# intercept and log lik values in a data frame
intercepts <- data.frame(intercept_val = c(seq(-0.8, 0.6, by = 0.1)),
                         log_likelihood = c(logLik_fun(-0.8),
                                            logLik_fun(-0.7),
                                            logLik_fun(-0.6),
                                            logLik_fun(-0.5),
                                            logLik_fun(-0.4),
                                            logLik_fun(-0.3),
                                            logLik_fun(-0.2),
                                            logLik_fun(-0.1),
                                            logLik_fun(0),
                                            logLik_fun(0.1),
                                            logLik_fun(0.2),
                                            logLik_fun(0.3),
                                            logLik_fun(0.4),
                                            logLik_fun(0.5),
                                            logLik_fun(0.6)
                                            ))

# VISUALIZE (the horizontal lines mark peak - 1.92 and peak - 0.66,
# the 95% and 75% profile cutoffs)
ggplot(intercepts, aes(x=intercept_val, y=log_likelihood)) +
  geom_point() +
  stat_smooth() +
  geom_hline(yintercept = -1251.602) +
  geom_hline(yintercept = -1250.342)

##############################
# SLOPE VALUES
##############################

########## 95% CI ##########

CI_fun(logLik_fun2(0.3), peak, 1.92) # outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.31), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.32), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.33), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.34), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.35), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.36), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun2(0.37), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun2(0.38), peak, 1.92)
## [1] "within_CI"
CI_fun(logLik_fun2(0.39), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.4), peak, 1.92)# outside 95% CI
## [1] "outside_CI"
########## 75% CI ##########

CI_fun(logLik_fun2(0.36), peak, 0.66)# outside 75% CI
## [1] "outside_CI"
CI_fun(logLik_fun2(0.37), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun2(0.38), peak, 0.66)
## [1] "within_CI"
CI_fun(logLik_fun2(0.39), peak, 0.66)# outside 75% CI
## [1] "outside_CI"
# slope and log lik values in a data frame
slopes <- data.frame(slope_vals = c(seq(0.35, 0.4, by = 0.005)),
                         log_likelihood = c(logLik_fun2(0.35),
                                            logLik_fun2(0.355),
                                            logLik_fun2(0.36),
                                            logLik_fun2(0.365),
                                            logLik_fun2(0.37),
                                            logLik_fun2(0.375),
                                            logLik_fun2(0.38),
                                            logLik_fun2(0.385),
                                            logLik_fun2(0.39),
                                            logLik_fun2(0.395),
                                            logLik_fun2(0.4)
                                            ))

ggplot(slopes, aes(x=slope_vals, y=log_likelihood)) +
  geom_point() +
  stat_smooth() +
  geom_hline(yintercept = -1251.602) +
  geom_hline(yintercept = -1250.342)
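The point-by-point calls above could be automated with sapply. Here is a self-contained sketch of the same brute-force idea on simulated data (the data, grid, and names are illustrative, not the quail fit):

```r
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x + rnorm(100, sd = 1)

# profile log-likelihood of the slope: fix b1, let lm refit the rest
prof_logLik <- function(b1) as.numeric(logLik(lm(y - b1 * x ~ 1)))

peak       <- as.numeric(logLik(lm(y ~ x)))
slope_grid <- seq(0.3, 0.7, by = 0.01)
prof       <- sapply(slope_grid, prof_logLik)

# slopes whose profile log-likelihood sits within 1.92 of the peak
inside_95 <- slope_grid[peak - prof <= qchisq(0.95, 1) / 2]
range(inside_95)  # approximate brute-force 95% CI for the slope
```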


4.4 The Power of the Prior

set.seed(603)
# FIT MODEL WITH SET PRIOR
quail_mod_bayes_prior <- stan_glm(Culmen ~ Tarsus,
                data = quails, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 1.37048 seconds (Warm-up)
##                1.08336 seconds (Sampling)
##                2.45384 seconds (Total)
## 
## (similar sampling output for chains 2-4 omitted)
summary(quail_mod_bayes_prior, digits = 5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = quails, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 766
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.71528     0.11641    -0.94704    -0.79293    -0.71584
## Tarsus            0.39236     0.00336     0.38580     0.39009     0.39234
## sigma             1.24689     0.03048     1.18880     1.22645     1.24626
## mean_PPD         11.72725     0.06262    11.60198    11.68550    11.72781
## log-posterior -1265.49767     1.16732 -1268.56628 -1265.96709 -1265.19300
##                 75%         97.5%    
## (Intercept)      -0.63334    -0.49244
## Tarsus            0.39455     0.39898
## sigma             1.26646     1.30830
## mean_PPD         11.77015    11.84777
## log-posterior -1264.66677 -1264.20231
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00202 0.99974 3327 
## Tarsus        0.00006 0.99969 3208 
## sigma         0.00053 0.99961 3265 
## mean_PPD      0.00105 0.99984 3536 
## log-posterior 0.02576 1.00079 2054 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# VISUALIZE: the strong-prior fit is still similar to the flat-prior fit
quail_chains_prior <- as.data.frame(quail_mod_bayes_prior)

quail_plot2 +
  #FLAT PRIOR *(already saved in plot2)*
  # geom_abline(intercept=quail_chains[,1], slope = quail_chains[,2], alpha=0.1, color="lightblue") +
  # geom_abline(intercept=coef(quail_mod_bayes)[1], slope = coef(quail_mod_bayes)[2], color="steelblue4") +
   #STRONG PRIOR
  geom_abline(intercept=quail_chains_prior[,1], slope = quail_chains_prior[,2], alpha=0.1, color="khaki1") +
  geom_abline(intercept=coef(quail_mod_bayes_prior)[1], slope = coef(quail_mod_bayes_prior)[2], color="khaki4") +
  geom_label(aes(label = "Flat Prior", x=18, y=9), color = "steelblue4", fill = "lightblue", alpha = 0.1, size = 4, label.size = 0, fontface = "italic") +
  geom_label(aes(label = "Strong Prior", x=22, y=5), color = "khaki4", fill = "khaki1", alpha=0.1, size = 4, label.size = 0, fontface = "italic") +
  labs(title = "Flat vs Strong Prior", x='Tarsus(mm)', y='Culmen(mm)')

##############################
# A VERY SMALL SAMPLE SIZE
##############################
sample_size_20 <- quails[sample(1:nrow(quails), 20,
    replace=FALSE),]

quail_mod_bayes_prior_20 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_20, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 0.062224 seconds (Warm-up)
##                0.0597 seconds (Sampling)
##                0.121924 seconds (Total)
## 
## (similar sampling output for chains 2-4 omitted)
#coefficients
summary(quail_mod_bayes_prior_20, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_20, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 20
## 
## Estimates:
##                 mean      sd        2.5%      25%       50%       75%    
## (Intercept)    -1.14366   0.35799  -1.83740  -1.37367  -1.14519  -0.90813
## Tarsus          0.39949   0.00355   0.39239   0.39715   0.39958   0.40177
## sigma           1.61406   0.26145   1.19422   1.43323   1.58352   1.76147
## mean_PPD       11.12019   0.50430  10.10790  10.79428  11.12194  11.44762
## log-posterior -45.83792   1.13153 -48.71766 -46.33265 -45.55740 -45.00280
##                 97.5%  
## (Intercept)    -0.44964
## Tarsus          0.40654
## sigma           2.23481
## mean_PPD       12.10374
## log-posterior -44.55164
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00635 1.00052 3174 
## Tarsus        0.00006 1.00064 3143 
## sigma         0.00514 1.00095 2583 
## mean_PPD      0.00840 1.00020 3602 
## log-posterior 0.02436 1.00035 2158 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### DOES include 0.4 in the 95% credible interval 
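The interval check above can also be done programmatically instead of read off the summary table; rstanarm's `posterior_interval()` returns central posterior intervals for each parameter. A small sketch, assuming the fitted `quail_mod_bayes_prior_20` object from above:

```r
# Central 95% credible interval for each parameter; the question is
# whether 0.4 lies between the 2.5% and 97.5% bounds for Tarsus.
ci_95 <- posterior_interval(quail_mod_bayes_prior_20, prob = 0.95)
ci_95["Tarsus", ]

# TRUE if the prior mean of 0.4 is inside the interval
ci_95["Tarsus", 1] <= 0.4 & 0.4 <= ci_95["Tarsus", 2]
```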


##############################
# TRY SAMPLE SIZES
##############################
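One caveat before comparing sample sizes: each subsample below is drawn with `sample()`, so the fitted intervals will shift slightly on every run. Calling `set.seed()` before each draw makes the comparison reproducible, and passing `refresh = 0` (forwarded by `stan_glm()` to Stan) suppresses the chain-progress printout. A minimal sketch, assuming the `quails` data frame used throughout (the seed value is arbitrary):

```r
set.seed(607)  # any fixed seed; 607 is an arbitrary choice
sample_size_500 <- quails[sample(1:nrow(quails), 500,
    replace = FALSE), ]

quail_mod_bayes_prior_500 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_500,
                family = gaussian(),
                prior = normal(0.4, 0.01),
                refresh = 0)  # silence per-iteration output
```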


###### N = 500 ##############
sample_size_500 <- quails[sample(1:nrow(quails), 500,
    replace=FALSE),]

quail_mod_bayes_prior_500 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_500, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_500, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_500, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 500
## 
## Estimates:
##                 mean       sd         2.5%       25%        50%     
## (Intercept)     -0.73022    0.12280   -0.97816   -0.81346   -0.72744
## Tarsus           0.39528    0.00348    0.38850    0.39294    0.39524
## sigma            1.23231    0.03758    1.16244    1.20776    1.23129
## mean_PPD        11.85166    0.07762   11.69521   11.80106   11.85433
## log-posterior -822.32920    1.14407 -825.41016 -822.80774 -822.01740
##                 75%        97.5%   
## (Intercept)     -0.64919   -0.48562
## Tarsus           0.39756    0.40244
## sigma            1.25729    1.30878
## mean_PPD        11.90283   12.00200
## log-posterior -821.51416 -821.07404
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00218 0.99972 3183 
## Tarsus        0.00006 0.99967 3103 
## sigma         0.00064 1.00053 3479 
## mean_PPD      0.00133 1.00026 3428 
## log-posterior 0.02342 0.99979 2387 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### sample size of 500 includes 0.4


###### N = 600 ###############
sample_size_600 <- quails[sample(1:nrow(quails), 600,
    replace=FALSE),]

quail_mod_bayes_prior_600 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_600, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_600, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_600, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 600
## 
## Estimates:
##                 mean       sd         2.5%       25%        50%     
## (Intercept)     -0.72172    0.11828   -0.95447   -0.80129   -0.72132
## Tarsus           0.39366    0.00341    0.38688    0.39138    0.39363
## sigma            1.24085    0.03498    1.17423    1.21724    1.24006
## mean_PPD        11.70730    0.07042   11.56956   11.66080   11.70647
## log-posterior -989.78741    1.15359 -992.73581 -990.30853 -989.46911
##                 75%        97.5%   
## (Intercept)     -0.63924   -0.49284
## Tarsus           0.39593    0.40060
## sigma            1.26404    1.31182
## mean_PPD        11.75457   11.84543
## log-posterior -988.95915 -988.50870
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00199 1.00107 3530 
## Tarsus        0.00006 1.00110 3537 
## sigma         0.00059 0.99931 3516 
## mean_PPD      0.00116 0.99999 3677 
## log-posterior 0.02466 1.00290 2188 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### sample size of 600 includes 0.4


###### N = 650 ##############
sample_size_650 <- quails[sample(1:nrow(quails), 650,
    replace=FALSE),]

quail_mod_bayes_prior_650 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_650, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_650, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_650, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 650
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.77207     0.12255    -1.00961    -0.85616    -0.77013
## Tarsus            0.39356     0.00349     0.38678     0.39123     0.39358
## sigma             1.23149     0.03206     1.17123     1.20997     1.23032
## mean_PPD         11.77585     0.06821    11.64053    11.73095    11.77565
## log-posterior -1066.97889     1.14174 -1069.85256 -1067.50102 -1066.67161
##                 75%         97.5%    
## (Intercept)      -0.69015    -0.52998
## Tarsus            0.39595     0.40047
## sigma             1.25191     1.29746
## mean_PPD         11.82208    11.90975
## log-posterior -1066.15368 -1065.69411
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00216 0.99975 3230 
## Tarsus        0.00006 0.99971 3277 
## sigma         0.00054 0.99949 3500 
## mean_PPD      0.00119 1.00085 3300 
## log-posterior 0.02371 0.99929 2319 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### sample size of 650 STILL includes 0.4


###### N = 660 ##############
sample_size_660 <- quails[sample(1:nrow(quails), 660,
    replace=FALSE),]

quail_mod_bayes_prior_660 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_660, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_660, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_660, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 660
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.77755     0.12034    -1.01927    -0.85497    -0.77633
## Tarsus            0.39365     0.00350     0.38685     0.39138     0.39361
## sigma             1.24850     0.03328     1.18521     1.22620     1.24748
## mean_PPD         11.74623     0.06570    11.62106    11.70207    11.74509
## log-posterior -1092.49566     1.14233 -1095.50943 -1092.99692 -1092.16884
##                 75%         97.5%    
## (Intercept)      -0.69778    -0.53860
## Tarsus            0.39594     0.40063
## sigma             1.27132     1.31394
## mean_PPD         11.78959    11.87932
## log-posterior -1091.66219 -1091.21554
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00213 1.00080 3181 
## Tarsus        0.00006 1.00179 3177 
## sigma         0.00061 1.00069 3004 
## mean_PPD      0.00109 1.00055 3623 
## log-posterior 0.02433 1.00190 2204 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### sample size of 660 STILL includes 0.4


###### N = 665 ##############
sample_size_665 <- quails[sample(1:nrow(quails), 665,
    replace=FALSE),]

quail_mod_bayes_prior_665 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_665, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_665, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_665, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 665
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.71015     0.11904    -0.94202    -0.78976    -0.71183
## Tarsus            0.39245     0.00347     0.38562     0.39012     0.39245
## sigma             1.24456     0.03251     1.18289     1.22267     1.24383
## mean_PPD         11.79340     0.06715    11.66235    11.74810    11.79291
## log-posterior -1098.44170     1.14189 -1101.54574 -1098.95572 -1098.14220
##                 75%         97.5%    
## (Intercept)      -0.63129    -0.47047
## Tarsus            0.39483     0.39919
## sigma             1.26573     1.31164
## mean_PPD         11.83667    11.92834
## log-posterior -1097.60861 -1097.17338
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00217 1.00163 2999 
## Tarsus        0.00006 1.00258 3128 
## sigma         0.00056 1.00213 3328 
## mean_PPD      0.00111 1.00018 3640 
## log-posterior 0.02256 1.00054 2563 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### sample size of 665 STILL includes 0.4


###### N = 667 ##############
sample_size_667 <- quails[sample(1:nrow(quails), 667,
    replace=FALSE),]

quail_mod_bayes_prior_667 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_667, 
                family=gaussian(),
                prior = normal(0.4,0.01))
## (Stan sampling progress for chains 1-4 omitted; each chain completed 1000 warmup + 1000 sampling iterations, as above.)
#coefficients
summary(quail_mod_bayes_prior_667, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_667, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 667
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.71906     0.12259    -0.96357    -0.79943    -0.72152
## Tarsus            0.39220     0.00354     0.38536     0.38983     0.39223
## sigma             1.25559     0.03310     1.19305     1.23339     1.25446
## mean_PPD         11.68940     0.06589    11.56089    11.64494    11.68831
## log-posterior -1107.81822     1.19305 -1110.94248 -1108.32195 -1107.50045
##                 75%         97.5%    
## (Intercept)      -0.63434    -0.47917
## Tarsus            0.39456     0.39912
## sigma             1.27730     1.32265
## mean_PPD         11.73290    11.82331
## log-posterior -1106.95961 -1106.52129
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00223 1.00094 3017 
## Tarsus        0.00006 1.00124 3415 
## sigma         0.00060 1.00004 3053 
## mean_PPD      0.00108 1.00052 3741 
## log-posterior 0.02545 1.00128 2197 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### at n = 667 the 95% credible interval for the Tarsus slope, (0.38536, 0.39912), does NOT include 0.4


###### N = 668 ##############
# draw a random subsample of 668 quails, without replacement
sample_size_668 <- quails[sample(1:nrow(quails), 668,
    replace = FALSE), ]

# refit the model with the strong normal(0.4, 0.01) prior on the Tarsus slope
quail_mod_bayes_prior_668 <- stan_glm(Culmen ~ Tarsus,
                data = sample_size_668, 
                family = gaussian(),
                prior = normal(0.4, 0.01))
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## 
## Chain 1, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 0.979297 seconds (Warm-up)
##                0.872519 seconds (Sampling)
##                1.85182 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## 
## Chain 2, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 1.15399 seconds (Warm-up)
##                1.00807 seconds (Sampling)
##                2.16206 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## 
## Chain 3, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 1.04145 seconds (Warm-up)
##                0.978625 seconds (Sampling)
##                2.02008 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## 
## Chain 4, Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4, Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4, Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4, Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4, Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4, Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4, Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4, Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4, Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4, Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4, Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4, Iteration: 2000 / 2000 [100%]  (Sampling)
##  Elapsed Time: 1.08234 seconds (Warm-up)
##                0.905152 seconds (Sampling)
##                1.98749 seconds (Total)
# coefficient summary
summary(quail_mod_bayes_prior_668, digits=5)
## stan_glm(formula = Culmen ~ Tarsus, family = gaussian(), data = sample_size_668, 
##     prior = normal(0.4, 0.01))
## 
## Family: gaussian (identity)
## Algorithm: sampling
## Posterior sample size: 4000
## Observations: 668
## 
## Estimates:
##                 mean        sd          2.5%        25%         50%      
## (Intercept)      -0.77279     0.11971    -1.01118    -0.85305    -0.77366
## Tarsus            0.39384     0.00345     0.38706     0.39153     0.39393
## sigma             1.22806     0.03246     1.16495     1.20629     1.22720
## mean_PPD         11.71073     0.06641    11.57790    11.66676    11.71007
## log-posterior -1094.14638     1.17294 -1097.21767 -1094.64298 -1093.82687
##                 75%         97.5%    
## (Intercept)      -0.69275    -0.53515
## Tarsus            0.39610     0.40048
## sigma             1.24933     1.29265
## mean_PPD         11.75466    11.83862
## log-posterior -1093.29565 -1092.84347
## 
## Diagnostics:
##               mcse    Rhat    n_eff
## (Intercept)   0.00213 1.00188 3162 
## Tarsus        0.00006 1.00150 3289 
## sigma         0.00056 1.00002 3414 
## mean_PPD      0.00112 0.99956 3491 
## log-posterior 0.02471 1.00129 2253 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
#### at n = 668 the 95% credible interval for the Tarsus slope, (0.38706, 0.40048), DOES include 0.4


######## In these runs the 95% credible interval for the Tarsus slope excludes 0.4 at n = 667 but includes it at n = 668. Because each subsample (and each MCMC run) is random, the exact sample size at which 0.4 drops out of the interval will shift from run to run; the overall trend is that as n grows, the data increasingly overwhelm the strong normal(0.4, 0.01) prior and pull the posterior toward the slope the data support (~0.39).
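The prior-versus-data tug-of-war behind this exercise can be sketched with a much simpler conjugate normal-normal model (this is an illustration, not the stan_glm fit above: the prior, the "true" value 0.39, and the observation noise are all assumed numbers chosen to mimic the situation). With a strong prior centered at 0.4 and data that actually center on 0.39, the 95% posterior interval contains 0.4 at small n but slides off it as n grows:

```python
import math

# Illustrative conjugate normal-normal update (NOT the stan_glm model):
# strong prior N(0.4, 0.01^2) on a mean whose data actually center on 0.39.
prior_mean, prior_sd = 0.4, 0.01   # analogous to prior = normal(0.4, 0.01)
data_mean, data_sd = 0.39, 0.1     # hypothetical data center and noise

def posterior_interval(n):
    """95% posterior interval for the mean after n observations."""
    prec = 1 / prior_sd**2 + n / data_sd**2                    # posterior precision
    post_mean = (prior_mean / prior_sd**2
                 + n * data_mean / data_sd**2) / prec          # precision-weighted mean
    post_sd = math.sqrt(1 / prec)
    return post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd

for n in (10, 100, 500, 1000):
    lo, hi = posterior_interval(n)
    print(f"n={n:4d}  95% interval: ({lo:.4f}, {hi:.4f})  "
          f"contains 0.4: {lo <= 0.4 <= hi}")
```

With these assumed numbers, small n leaves 0.4 inside the interval and large n pushes it out, which is the same qualitative behavior the stan_glm refits are probing, just without the run-to-run noise of drawing a fresh random subsample each time.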